Document Clustering with User Feedback
Authors
Abstract
In this paper, we focus on the problem of incorporating user input into an automated document clustering process to improve clustering performance. Before clustering starts, the user can provide a small set of labeled documents that form the initial descriptions of the clusters. If the user provides no initial information, the clustering process has to form the initial cluster descriptions by itself. When clustering finishes, the user can navigate the clusters, assess their quality and, if necessary, provide feedback to the clustering process. With this feedback, the clustering process retrains itself to obtain new clusters that better reflect both the nature of the data and the user's intent. The results show that our methods for initializing the clustering process are valuable and that user feedback improves the quality of the clusters.
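The loop described in the abstract (seed the clusters from a few labeled documents, cluster, collect corrections, retrain) can be illustrated with a minimal sketch. This is not the authors' method, which the abstract does not specify. It assumes a seeded k-means-style procedure over bag-of-words vectors with cosine similarity, and the names `seeded_cluster` and `apply_feedback`, the toy documents, and the labels are all hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for one document."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Summed term counts; cosine similarity ignores the scale."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def seeded_cluster(docs, seeds, iterations=10):
    """Cluster docs, starting from user-labeled seeds {doc_index: label}.

    Assumption: every cluster has at least one seed document, and
    seed documents never change cluster.
    """
    vecs = [vectorize(d) for d in docs]
    labels = sorted(set(seeds.values()))
    members = dict(seeds)  # initial cluster descriptions come from the seeds
    assignment = {}
    for _ in range(iterations):
        cents = {c: centroid([vecs[i] for i, l in members.items() if l == c])
                 for c in labels}
        new_assignment = {
            i: seeds[i] if i in seeds
            else max(labels, key=lambda c: cosine(v, cents[c]))
            for i, v in enumerate(vecs)
        }
        if new_assignment == assignment:
            break  # converged
        assignment = members = new_assignment
    return assignment


def apply_feedback(docs, seeds, corrections):
    """Treat user corrections as extra labeled documents and recluster."""
    updated = {**seeds, **corrections}
    return seeded_cluster(docs, updated), updated


docs = [
    "stock market shares trading",   # 0: labeled "finance" by the user
    "football match goal league",    # 1: labeled "sports" by the user
    "market trading prices rise",    # 2
    "league goal scored striker",    # 3
    "shares prices fall",            # 4
    "striker transfer market",       # 5: lands in "finance" via "market"
]
seeds = {0: "finance", 1: "sports"}
first = seeded_cluster(docs, seeds)
# The user inspects the clusters and moves document 5 to "sports".
second, seeds = apply_feedback(docs, seeds, {5: "sports"})
```

In this sketch, feedback is simply folded into the seed set, so each correction both pins the corrected document and shifts the cluster description used on the next run.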
Similar resources
RRLUFF: Ranking function based on Reinforcement Learning using User Feedback and Web Document Features
The principal aim of a search engine is to provide results sorted according to the user's requirements. To achieve this aim, it employs ranking methods to rank web documents based on their significance and relevance to the user's query. The novelty of this paper is a user feedback-based ranking algorithm that uses reinforcement learning. The proposed algorithm is called RRLUFF, in which the rank...
Web pages ranking algorithm based on reinforcement learning and user feedback
The main challenge of a search engine is ranking web documents to provide the best response to a user's query. Despite the huge number of results extracted for a user's query, users examine only the first few; therefore, placing the relevant results in the top ranks is of great importance. In this paper, a ranking algorithm based on the reinforcement le...
Document Similarity Judgment for Interactive Document Clustering
This paper investigates the task of document similarity judgment for interactive document clustering. We suppose that one of the promising approaches to developing the next generation of web search engines is to incorporate a user feedback mechanism into constrained clustering. As a basis for designing such search engines, it is important to study the interface design that can reduce users' burden of givi...
Deciphering cluster representations
There are several recent studies that propose search output clustering as an alternative representation method to ranked output. Users are provided with cluster representations instead of lists of titles and invited to make decisions on groups of documents. This paper discusses the difficulties involved in representing clusters for users' evaluation in a concise but easily interpretable form. The...
UCSC at Relevance Feedback Track
The relevance feedback track in TREC 2009 focuses on two subtasks: actively selecting good documents for users to provide relevance feedback on, and retrieving documents based on user relevance feedback. For the first task, we tried a clustering-based method and the Transductive Experimental Design (TED) method proposed by Yu et al. [5]. For the clustering-based method, we use the K-means algorithm to...
Publication date: 2008